Automatic Context Pattern Generation for Entity Set Expansion
Abstract
Entity Set Expansion (ESE) is a valuable task that aims to find entities of the target semantic class described by given seed entities. Various Natural Language Processing (NLP) and Information Retrieval (IR) downstream applications have benefited from ESE due to its ability to discover knowledge. Although existing corpus-based ESE methods have achieved great progress, they still rely on corpora with high-quality entity information annotated, because most of them need to obtain context patterns through the position of the entity in a sentence. Therefore, the quality of the corpora and their entity annotation has become the bottleneck that limits the performance of such methods. To overcome this dilemma and make ESE models free from the dependence on entity annotation, our work aims to explore a new ESE paradigm, namely corpus-independent ESE. Specifically, we devise a context pattern generation module that utilizes autoregressive language models (e.g., GPT-2) to automatically generate high-quality context patterns for entities. In addition, we propose GAPA, a novel ESE framework that leverages the aforementioned GenerAted PAtterns to expand target entities. Extensive experiments and detailed analyses on three widely used datasets demonstrate the effectiveness of our method. All codes are available at https://github.com/geekjuruo/GAPA .
Related Resources
Semi-Automatic Entity Set Refinement
State-of-the-art set expansion algorithms produce varying quality expansions for different entity types. Even for the highest quality expansions, errors still occur and manual refinements are necessary for most practical uses. In this paper, we propose algorithms to aid this refinement process, greatly reducing the amount of manual labor required. The methods rely on the fact that most expansi...
Entity Set Expansion using Topic information
This paper proposes three modules based on latent topics of documents for alleviating “semantic drift” in bootstrapping entity set expansion. These new modules are added to a discriminative bootstrapping algorithm to realize topic feature generation, negative example selection and entity candidate pruning. In this study, we model latent topics with LDA (Latent Dirichlet Allocation) in an unsupe...
Automatic term list generation for entity tagging
Motivation: Many entity taggers and information extraction systems make use of lists of terms of entities such as people, places, genes or chemicals. These lists have traditionally been constructed manually. We show that distributional clustering methods which group words based on the contexts that they appear in, including neighboring words and syntactic relations extracted using a shallow pars...
Entity List Completion Using Set Expansion Techniques
Set expansion refers to expanding a partial set of “seed” objects into a more complete set. In this paper, we focus on relation and list extraction techniques to perform the Entity List Completion task through a two-stage retrieval process. The first stage takes the given query entity and target entity examples as seeds and performs set expansion. In the second stage, only those candidates who have valid URI in Bi...
Entity Set Expansion using Interactive Topic Information
We propose a new method for entity set expansion that achieves highly accurate extraction by suppressing the effect of semantic drift; it requires a small amount of interactive information. We supplement interactive information to re-train the topic models (based on interactive Unigram Mixtures) not only the contextual information. Although the topic information extracted from an unsupervised co...
Journal
Journal title: IEEE Transactions on Knowledge and Data Engineering
Year: 2023
ISSN: 1558-2191, 1041-4347, 2326-3865
DOI: https://doi.org/10.1109/tkde.2023.3275211